169 research outputs found
Generating and Sampling Orbits for Lifted Probabilistic Inference
A key goal in the design of probabilistic inference algorithms is identifying
and exploiting properties of the distribution that make inference tractable.
Lifted inference algorithms identify symmetry as a property that enables
efficient inference and seek to scale with the degree of symmetry of a
probability model. A limitation of existing exact lifted inference techniques
is that they do not apply to non-relational representations like factor graphs.
In this work we provide the first example of an exact lifted inference
algorithm for arbitrary discrete factor graphs. In addition we describe a
lifted Markov-Chain Monte-Carlo algorithm that provably mixes rapidly in the
degree of symmetry of the distribution
Probabilistic Program Abstractions
Abstraction is a fundamental tool for reasoning about complex systems.
Program abstraction has been utilized to great effect for analyzing
deterministic programs. At the heart of program abstraction is the relationship
between a concrete program, which is difficult to analyze, and an abstract
program, which is more tractable. Program abstractions, however, are typically
not probabilistic. We generalize non-deterministic program abstractions to
probabilistic program abstractions by explicitly quantifying the
non-deterministic choices. Our framework upgrades key definitions and
properties of abstractions to the probabilistic context. We also discuss
preliminary ideas for performing inference on probabilistic abstractions and
general probabilistic programs
Symbolic Exact Inference for Discrete Probabilistic Programs
The computational burden of probabilistic inference remains a hurdle for
applying probabilistic programming languages to practical problems of interest.
In this work, we provide a semantic and algorithmic foundation for efficient
exact inference on discrete-valued finite-domain imperative probabilistic
programs. We leverage and generalize efficient inference procedures for
Bayesian networks, which exploit the structure of the network to decompose the
inference task, thereby avoiding full path enumeration. To do this, we first
compile probabilistic programs to a symbolic representation. Then we adapt
techniques from the probabilistic logic programming and artificial intelligence
communities in order to perform inference on the symbolic representation. We
formalize our approach, prove it sound, and experimentally validate it against
existing exact and approximate inference techniques. We show that our inference
approach is competitive with inference procedures specialized for Bayesian
networks, thereby expanding the class of probabilistic programs that can be
practically analyzed
Overfitting in Synthesis: Theory and Practice (Extended Version)
In syntax-guided synthesis (SyGuS), a synthesizer's goal is to automatically
generate a program belonging to a grammar of possible implementations that
meets a logical specification. We investigate a common limitation across
state-of-the-art SyGuS tools that perform counterexample-guided inductive
synthesis (CEGIS). We empirically observe that as the expressiveness of the
provided grammar increases, the performance of these tools degrades
significantly.
We claim that this degradation is not only due to a larger search space, but
also due to overfitting. We formally define this phenomenon and prove
no-free-lunch theorems for SyGuS, which reveal a fundamental tradeoff between
synthesizer performance and grammar expressiveness.
A standard approach to mitigate overfitting in machine learning is to run
multiple learners with varying expressiveness in parallel. We demonstrate that
this insight can immediately benefit existing SyGuS tools. We also propose a
novel single-threaded technique called hybrid enumeration that interleaves
different grammars and outperforms the winner of the 2018 SyGuS competition
(Inv track), solving more problems and achieving a mean speedup.Comment: 24 pages (5 pages of appendices), 7 figures, includes proofs of
theorem
Recommended from our members
Don't mind the gap: Bridging network-wide objectives and device-level configurations
We reflect on the historical context that lead to Propane, a high-level language and compiler to help network operators bridge the gap between network-wide routing objectives and low-level configurations of devices that run complex, distributed protocols. We also highlight the primary contributions that Propane made to the networking literature and describe ongoing challenges. We conclude with an important lesson learned from the experience
FlashProfile: A Framework for Synthesizing Data Profiles
We address the problem of learning a syntactic profile for a collection of
strings, i.e. a set of regex-like patterns that succinctly describe the
syntactic variations in the strings. Real-world datasets, typically curated
from multiple sources, often contain data in various syntactic formats. Thus,
any data processing task is preceded by the critical step of data format
identification. However, manual inspection of data to identify the different
formats is infeasible in standard big-data scenarios.
Prior techniques are restricted to a small set of pre-defined patterns (e.g.
digits, letters, words, etc.), and provide no control over granularity of
profiles. We define syntactic profiling as a problem of clustering strings
based on syntactic similarity, followed by identifying patterns that succinctly
describe each cluster. We present a technique for synthesizing such profiles
over a given language of patterns, that also allows for interactive refinement
by requesting a desired number of clusters.
Using a state-of-the-art inductive synthesis framework, PROSE, we have
implemented our technique as FlashProfile. Across tasks over large
real datasets, we observe a median profiling time of only s.
Furthermore, we show that access to syntactic profiles may allow for more
accurate synthesis of programs, i.e. using fewer examples, in
programming-by-example (PBE) workflows such as FlashFill.Comment: 28 pages, SPLASH (OOPSLA) 201
The Silently Shifting Semicolon
Memory consistency models for modern concurrent languages have largely been designed from a system-centric point of view that protects, at all costs, optimizations that were originally designed for sequential programs. The result is a situation that, when viewed from a programmer\u27s standpoint, borders on absurd. We illustrate this unfortunate situation with a brief fable and then examine the opportunities to right our path
What do LLMs need to Synthesize Correct Router Configurations?
We investigate whether Large Language Models (e.g., GPT-4) can synthesize
correct router configurations with reduced manual effort. We find GPT-4 works
very badly by itself, producing promising draft configurations but with
egregious errors in topology, syntax, and semantics. Our strategy, that we call
Verified Prompt Programming, is to combine GPT-4 with verifiers, and use
localized feedback from the verifier to automatically correct errors.
Verification requires a specification and actionable localized feedback to be
effective. We show results for two use cases: translating from Cisco to Juniper
configurations on a single router, and implementing no-transit policy on
multiple routers. While human input is still required, if we define the
leverage as the number of automated prompts to the number of human prompts, our
experiments show a leverage of 10X for Juniper translation, and 6X for
implementing no-transit policy, ending with verified configurations
- …